Abstract: We present a composite vector selection method for an effective electronic nose
system that performs well even in noisy environments. Each composite vector generated
from an electronic nose data sample is evaluated by computing the discriminant distance.
By quantitatively measuring the amount of discriminative information in each composite
vector, composite vectors containing informative variables can be distinguished, and the
final composite features for odor classification are extracted using the selected composite
vectors. Using only the informative composite vectors, rather than all the generated composite
vectors, also helps to extract better composite features. Experimental results
with different volatile organic compound data show that the proposed system has good
classification performance even in a noisy environment compared to other methods.

Keywords: distance discriminant; composite vector; odor classification; sensor array; 
electronic nose 



1. Introduction 

An electronic nose is an instrument intended to identify the specific components of an odor. While
human olfactory sensing is easily fatigued, an electronic nose has the merit of consistently
detecting odors, including those harmful to the human body [1-4]. Electronic nose systems are used
for various purposes, such as quality control applications in the food and cosmetics industries, the
detection of odors regarding specific diseases for medical diagnosis, and the detection of gas leaks for 
environmental protection [3,5-9]. 

An electronic nose consists of a sensor array for chemical detection, which is made of polymer 
carbon composite materials, and a classifier based on various pattern recognition techniques. Hence, 
the sensitivity of a sensor array and the design of a classifier are crucial factors for the improvement of 
electronic noses. There are several types of sensor arrays for electronic noses [10-15]. Among them,
conducting polymer composites, intrinsically conducting polymers, and metal oxides are most commonly
used as sensing materials in conductivity sensors. Once volatile organic compounds (VOCs) are adsorbed
on the sensor surface, a specific response is obtained as a numerical variable by an electronic interface.

In classification problems, the processes can be decomposed into a few steps: feature selection, feature 
extraction and choosing a classifier. Various static or dynamic information for odor classification can be 
obtained from the sensor response curve [16-18]. In [17,18], five features, which are the relative change 
in resistance, the curve integral both over the gas adsorption and desorption process and the phase space 
integral, again over adsorption and desorption, are extracted from the response curves of six metal oxide 
sensors. The analysis of the dynamic features of metal oxide sensors was presented to classify four 
types of volatile compounds, namely acetone, acetic acid, acetaldehyde and butyric acid [16] and active 
analyses were proposed to deal with gas mixture problems [19,20]. In [21-23], various compensation 
methods were proposed to solve the drift problem causing a random temporal variation of the sensor 
response under identical conditions. 

The features extracted from the sensor array are fed into a classifier such as the NN (Nearest Neighbor 
rule) [2] or SVM (Support Vector Machine) [9] for prediction of the class label. In order to improve the
performance of a classifier, various feature extraction methods can be used for discriminant analysis and 
dimensionality reduction [24-27]. Since each method has its pros and cons, an appropriate method must 
be selected considering the properties of the data and the problem that needs to be solved. For instance, 
the PCA (Principal Component Analysis) method [28] does not utilize class information of data samples, 
and finds the projection vectors that correspond to a set of large eigenvalues of the total scatter matrix of 
data samples. Thus, it is more appropriate to use the PCA method for data representation, rather than data 
classification. On the other hand, the LDA (Linear Discriminant Analysis) method [29] seeks the linear
transformation that maximizes the ratio of the between-class scatter matrix ($S_B$) to the within-class
scatter matrix ($S_W$). While it gives good performance for classification problems, it suffers from the SSS
(Small Sample Size) problem [29] in the case of high-dimensional data.

The above methods extract features based on covariance matrices, which differ depending on
their objective functions. In contrast, some methods such as MatFLDA (Matrixized Fisher Linear
Discriminant Analysis) [30], 2DFLD (Two-Dimensional Fisher Linear Discriminant) [31], or CLDA
(Composite LDA) [32,33] use a different type of covariance matrix, called an image covariance
matrix. The elements of an image covariance matrix are defined as the expectation of the inner products
of predefined vectors. These methods are often effective for data with a large correlation between
primitive variables or for high-dimensional data such as electronic nose data [34], because they utilize
information about the statistical dependency among multiple primitive variables and save
computational effort.

The composite features are extracted by using the covariance of composite vectors composed of a
number of primitive variables in various shapes of windows. However, it is likely that there is redundancy
between the generated composite vectors. Moreover, if there are problems in
the data collection process, or if the collected primitive variables include attributes that have no
association with the classification problem, feature extraction does not
yield optimal solutions and the classification performance degrades [24]. Therefore, distinguishing
good composite vectors containing informative primitive variables before the feature extraction process
is important for extracting better composite features for classification.

In this paper, we propose a method to select the composite vectors which contain informative variables
in an electronic nose data sample measured by a sensor array. We measure the amount of discriminative
information that each composite vector has, based on the discriminant distance [35], rank the
composite vectors in descending order of their discriminant scores, and keep the top $n_{cf}$. The
informative composite vectors are distinguished before the process of feature extraction, and then the
composite features to be used for the classifier are extracted from only the selected composite vectors.
Potential benefits of this selection process include reductions in computation, storage,
and processing time, in addition to improved prediction performance. In the process of extracting
composite features, the computational effort grows polynomially with the number of composite
vectors ($v$), so the computational complexity can be significantly reduced by
the proposed method. By using a classifier in an electronic nose with the extracted composite features,
we design an electronic nose system that is robust to noisy environments (Figure 1). The experimental results
show that the proposed method gives very good classification results even in a noisy environment.

Figure 1. The schematic diagram of our electronic nose system. 

The rest of this paper is organized as follows. Section 2 introduces a discriminant distance and
presents how to select composite vectors based on their discriminant scores. Section 3 explains the
acquisition of electronic nose data and how composite features are extracted using the selected composite
vectors for odor classification. Section 4 describes the experimental results and the conclusions follow
in Section 5.

2. Composite Vector Selection Based on Discriminant Distance

Composite vectors can be defined in various ways depending on the shape of a window. The data
acquired from a sensor array is stored in an $n$-dimensional vector, and a composite vector $x_i \in \mathbb{R}^l$
consists of $l$ ($l < n$) primitive variables. Composite vectors are generated by shifting a window by
a step size $s$, which is usually smaller than the length of a composite vector, and thus composite vectors
overlap with each other, as shown in Figure 2. The correlation between neighboring variables can be
better utilized by using the covariance of composite vectors. The number of composite vectors $v$ is
$\lfloor (n - l)/s \rfloor + 1$, where $\lfloor \cdot \rfloor$ is the floor operator, which gives the largest integer that is not greater than the
value inside the operator. Then, the $k$-th data sample is represented by $X(k) = [x_1(k), \ldots, x_v(k)]^T \in \mathbb{R}^{v \times l}$,
which is a set of composite vectors. The final composite features for classification are extracted by using
the covariance of these composite vectors [36].
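
As a minimal sketch of the window-shifting step described above, the following Python snippet builds the composite vectors of one sample and checks the count formula. The function name `make_composite_vectors` and the toy sizes are illustrative assumptions, not part of the original system.

```python
import numpy as np

def make_composite_vectors(x, l, s):
    """Slide a window of length l over x with step s; return a (v, l) array."""
    n = x.shape[0]
    v = (n - l) // s + 1                      # floor((n - l) / s) + 1
    starts = np.arange(v) * s                 # first index of each window
    return np.stack([x[a:a + l] for a in starts])

# Toy sample: n = 10 primitive variables, window length l = 4, step s = 2.
x = np.arange(10.0)
X = make_composite_vectors(x, l=4, s=2)
print(X.shape)  # (4, 4): windows start at indices 0, 2, 4, 6
```

With $n = 32{,}000$, $l = 400$ and $s = 200$, the same formula gives the 159 composite vectors used in the experiments.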

Figure 2. Constructing composite vectors. 

However, overlapping composite vectors, as in Figure 2, may result in redundancy in
extracting composite features. Therefore, it is necessary to find the composite vectors that promise
good separability between different classes while keeping the samples of the same class as
close as possible. Motivated by the method to select individual variables based on a distance
discriminant [35], we define the distance within classes ($D_W^i$) and the distance between classes ($D_B^i$)
to compute the discriminant distance for the $i$-th composite vector $x_i(k)$
as follows:

[Equation (1): definitions of $D_W^i$ and $D_B^i$, summed over the elements $j$ of the composite vector and over the classes]

Here, $m_j^i$, $m_j$, and $N_i$ are the $j$-th element of the mean of the class $c_i$, the $j$-th element of the mean of
the whole training samples, and the number of samples in the class $c_i$, respectively. Then, the discriminant
distance for the $i$-th composite vector is computed by $D_B^i - \beta D_W^i$, which reflects the discriminative
information of each composite vector. The value of $\beta$ can be determined depending on the distribution
of data samples. For example, for a distribution that has good class separability but large
variance within the same class, a small penalty on $D_W^i$ will be better. By investigating the performance
with respect to $\beta$, we set $\beta$ to 2. For composite vector selection, we define the measure vector $S \in \mathbb{R}^v$,
whose $i$-th element is $S_i = D_B^i - \beta D_W^i$. Finally, the $n_{cf}$ composite vectors with the largest $S_i$ are selected
for extracting the final composite features.
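
The scoring and selection step can be sketched as follows. This is a hedged illustration only: the exact forms of the within-class and between-class distances are assumed here to be prior-weighted squared deviations summed over the elements of each composite vector, which may differ from the definition in [35]. The function names `discriminant_scores` and `select_composite_vectors` and the toy data are hypothetical.

```python
import numpy as np

def discriminant_scores(X, y, beta=2.0):
    """X: (N, v, l) composite vectors per sample; y: (N,) class labels.

    ASSUMPTION: D_W and D_B are prior-weighted squared deviations summed over
    the l elements of each composite vector; the paper's Eq. (1) may differ.
    """
    total_mean = X.mean(axis=0)                      # (v, l)
    d_w = np.zeros(X.shape[1])
    d_b = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]                               # samples of class c
        prior = Xc.shape[0] / X.shape[0]
        class_mean = Xc.mean(axis=0)                 # (v, l)
        d_w += prior * ((Xc - class_mean) ** 2).mean(axis=0).sum(axis=1)
        d_b += prior * ((class_mean - total_mean) ** 2).sum(axis=1)
    return d_b - beta * d_w                          # S_i = D_B^i - beta * D_W^i

def select_composite_vectors(scores, n_cf):
    """Indices of the n_cf largest scores, in descending order of score."""
    return np.argsort(scores)[::-1][:n_cf]

# Toy data: 2 classes, 3 composite vectors of length 2; only vector 0 separates them.
rng = np.random.default_rng(0)
y = np.array([0] * 10 + [1] * 10)
X = rng.normal(scale=0.1, size=(20, 3, 2))
X[y == 1, 0, :] += 5.0
scores = discriminant_scores(X, y)
print(select_composite_vectors(scores, n_cf=1))  # [0]
```

A larger $\beta$ penalizes within-class spread more heavily, so composite vectors that are noisy within a class drop in the ranking.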

3. Design of Electronic Nose System 

3.1. Acquisition of Electronic Nose Data 

The sensor array used in our system was implemented by dispensing a CB polymer composite-solvent
solution in a micromachined gas sensor array chip [15]. While the polymer composite has some
drawbacks such as sensor drift, limited sensor life, and sensitivity to temperature and humidity, it offers
many advantages over other materials when used as a gas sensor, e.g., a wide range of polymeric
materials, low cost, stable operation at room temperature, and low power consumption [10].
The sensor array consists of 16 separate sensors with an interdigitated electrode, microheater,
and micromachined membrane in each channel for further temperature-controlled measurement
applications (Table 1). The resistance change of each polymer composite film was monitored in response
to the incorporation of chemical vapor. It was amplified
by 20 times and recorded every 0.1 s (Figure 3). Measurement consisted of three steps: stabilization
(30 s), exposure (60 s), and purge (110 s). It was performed after the sensor array was placed into the
chamber and the resistance signal was stabilized. Then, the flow control unit in our system allows
the vapors to flow in at the desired concentration for about 60 s and afterward flushes the remainder
with air flow for about 110 s [37]. The measured data are collected on a PC using a data acquisition (DAQ)
board DAQ6062E and LabVIEW (National Instruments, USA). The voltage divider operated in the
range from -10 V to +10 V, and the gains of the 16 identical amplifiers were set to 10 (output/input voltage) for
maximum DAQ resolution [15].



Table 1. The list of 16 CB polymer composites used in the sensor array.

Number   Polymer I.D.
Ch 1     Poly(methyl methacrylate)
Ch 2     Polyvinylpyrrolidone
Ch 3     Poly(vinyl acetate)
Ch 4     Poly(ethylene oxide)
Ch 5     Polycaprolactone
Ch 6     Poly(4-methylstyrene)
Ch 7     Poly(styrene-co-methyl methacrylate)
Ch 8     Poly(ethylene-co-vinylacetate)
Ch 9     Poly(bisphenol A carbonate)
Ch 10    Poly(4-vinyl pyridine)
Ch 11    Poly(vinyl butyral)-co-vinyl alcohol-co-vinyl acetate
Ch 12    Poly(vinyl stearate)
Ch 13    Ethyl cellulose
Ch 14    Polystyrene-block-polyisoprene-block-polystyrene
Ch 15    Hydroxypropyl cellulose
Ch 16    Cellulose acetate

Figure 3. Typical time-responses of 16 channel sensor array with respect to inflow of acetone
vapor at 5,000 ppm [2].

3.2. Extraction of Composite Features from Selected Composite Vectors

Patterns can be classified effectively if the within-class variance is small while the between-class
variance is large. Similar to LDA, a discriminant analysis using the covariance of composite
vectors is derived from the between-class covariance matrix ($C_B$) and the within-class covariance
matrix ($C_W$) [29]. Assume that each training sample belongs to one of $c$ classes, and that there are
$N_i$ samples in the class $c_i$. Let $X'(k) \in \mathbb{R}^{n_{cf} \times l}$ denote the set of the selected composite vectors of the
$k$-th sample. Then, $C_W \in \mathbb{R}^{n_{cf} \times n_{cf}}$ is defined as

$$C_W = \sum_{i=1}^{c} p_i \sum_{k \in c_i} (X'(k) - M_i)(X'(k) - M_i)^T \quad (2)$$

where $M_i = \frac{1}{N_i} \sum_{X'(k) \in c_i} X'(k)$. Here, $p_i$ is a prior probability that a sample belongs to class $c_i$.
$C_B \in \mathbb{R}^{n_{cf} \times n_{cf}}$ is also defined as

$$C_B = \sum_{i=1}^{c} p_i (M_i - M)(M_i - M)^T \quad (3)$$

The image covariance can also be interpreted from another point of view, not from the view of the
composite vectors. Letting $x_j(k)$, $m_{ij}$ and $m_j$ be the $j$-th column vectors of $X'(k)$, $M_i$ and $M$, respectively, $C_W$ and $C_B$
can be rewritten as

$$C_W = \sum_{j=1}^{l} \sum_{i=1}^{c} p_i \sum_{k \in c_i} (x_j(k) - m_{ij})(x_j(k) - m_{ij})^T, \quad C_B = \sum_{j=1}^{l} \sum_{i=1}^{c} p_i (m_{ij} - m_j)(m_{ij} - m_j)^T \quad (4)$$

$x_j(k)$ consists of the $j$-th elements of each of the selected composite vectors, which are sampled from
$X'(k)$ with regularly varying intervals. This has an effect similar to generating $l$ times more data samples
of smaller size. The increased number of data samples provides robustness to the
variation caused by noise.
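
To make the two views of the image covariance concrete, the sketch below computes $C_W$ and $C_B$ from whole matrices $X'(k)$ and then recomputes $C_W$ column by column, checking that both agree. The priors $p_i = N_i / N$, the unnormalized inner sum, and the function names are assumptions made for illustration; the data are random placeholders.

```python
import numpy as np

def image_covariances(Xs, y):
    """Xs: (N, n_cf, l) selected composite vectors per sample; y: (N,) labels.

    Returns (C_W, C_B), each (n_cf, n_cf).
    ASSUMPTION: priors p_i = N_i / N and an unnormalized sum over k in c_i.
    """
    N, n_cf, _ = Xs.shape
    M = Xs.mean(axis=0)                              # total mean, (n_cf, l)
    C_W = np.zeros((n_cf, n_cf))
    C_B = np.zeros((n_cf, n_cf))
    for c in np.unique(y):
        Xc = Xs[y == c]
        p = Xc.shape[0] / N
        Mc = Xc.mean(axis=0)                         # class mean M_i
        for Xk in Xc:
            C_W += p * (Xk - Mc) @ (Xk - Mc).T
        C_B += p * (Mc - M) @ (Mc - M).T
    return C_W, C_B

def within_covariance_by_columns(Xs, y):
    """Same C_W, accumulated one column x_j(k) at a time."""
    N, n_cf, l = Xs.shape
    C_W = np.zeros((n_cf, n_cf))
    for c in np.unique(y):
        Xc = Xs[y == c]
        p = Xc.shape[0] / N
        Mc = Xc.mean(axis=0)
        for j in range(l):
            for Xk in Xc:
                d = Xk[:, j] - Mc[:, j]              # x_j(k) minus class-mean column
                C_W += p * np.outer(d, d)
    return C_W

rng = np.random.default_rng(1)
y = np.repeat([0, 1], 5)
Xs = rng.normal(size=(10, 3, 4))                     # N = 10, n_cf = 3, l = 4
C_W, C_B = image_covariances(Xs, y)
print(np.allclose(C_W, within_covariance_by_columns(Xs, y)))  # True
```

Each sample contributes $l$ column vectors of dimension $n_{cf}$ to the covariance estimate, which is the sense in which the number of samples is effectively multiplied by $l$.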

Composite features are obtained by linear combinations of the composite vectors, and each feature
is a vector whose dimension is equal to the dimension of the composite vector. For composite feature
extraction, the projection matrix $W$ is found by maximizing the following objective function:

$$W = \arg\max_{W} \frac{|W^T C_B W|}{|W^T C_W W|} \quad (5)$$

The set of composite features $Y(k)$ is obtained by projecting $X'(k)$ onto the projection matrix $W$ as

$$Y(k) = W^T X'(k), \quad k = 1, 2, \ldots, N, \quad (6)$$

where $Y(k) \in \mathbb{R}^{m \times l}$ has $m$ composite features $[y_1(k), \ldots, y_m(k)]^T$.
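
One common way to maximize a determinant ratio of this kind is to take the leading generalized eigenvectors of the pair ($C_B$, $C_W$). The sketch below does this with NumPy only, by whitening with a Cholesky factor of $C_W$. The small ridge term `eps`, the function names, and the random test matrices are assumptions for illustration; the source does not state how $C_W$ is inverted or regularized.

```python
import numpy as np

def composite_projection(C_W, C_B, m, eps=1e-6):
    """Return W (n_cf, m) whose columns are the top-m generalized eigenvectors.

    ASSUMPTION: a ridge eps * I keeps C_W positive definite; the paper does
    not say whether or how C_W is regularized.
    """
    n = C_W.shape[0]
    L = np.linalg.cholesky(C_W + eps * np.eye(n))   # C_W + eps*I = L L^T
    L_inv = np.linalg.inv(L)
    A = L_inv @ C_B @ L_inv.T                       # whitened, symmetric C_B
    vals, vecs = np.linalg.eigh(A)                  # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:m]                # indices of the m largest
    return L_inv.T @ vecs[:, top]                   # map back to original coordinates

def extract_features(W, Xk):
    """Y(k) = W^T X'(k): (m, l) composite features from an (n_cf, l) input."""
    return W.T @ Xk

# Random placeholder matrices standing in for C_W and C_B (n_cf = 5).
rng = np.random.default_rng(2)
B = rng.normal(size=(5, 5))
C_W = B @ B.T + np.eye(5)                           # symmetric positive definite
V = rng.normal(size=(5, 2))
C_B = V @ V.T                                       # symmetric, rank 2
W = composite_projection(C_W, C_B, m=2)
Y = extract_features(W, rng.normal(size=(5, 7)))    # l = 7
print(Y.shape)  # (2, 7)
```

Each row of `Y` is one composite feature of length $l$, matching the statement that every feature has the same dimension as a composite vector.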

The length of the window ($l$), the number of composite features ($m$) and the step size of the shift ($s$)
are important parameters that influence the classification performance. We investigated the classification
rates with respect to $l$, $m$ and $s$. Table 2 shows the classification rates with respect to $l$ and $m$. In this
case, we set s = 111 as in [32]. As can be seen in Table 2, the classification rates are not sensitive to $l$ if $m$
is properly chosen. We set $l$ and $m$ to 400 and 25, respectively. Then, we investigated the classification
rates with respect to $s$. As can be seen in Table 3, the classification rates are not sensitive to $s$, and the
classification rate for $s = 200$ was slightly better than those for other values of $s$. Therefore, we set $s$ to 200.
Also, in order to find the optimal number of selected composite vectors, we checked the classification
rates for the electronic nose data while increasing the number of selected composite vectors $n_{cf}$. As a result,
we set the number of selected composite vectors $n_{cf}$ to 150.

Table 2. Classification rates (%) with respect to $l$ and $m$.

l \ m      1      3      5     11     16     21     26     31     36
100     67.5   91.9   98.1   98.1   98.1   98.1   98.1   98.1   98.1
200     72.5   91.9   98.1   98.1   98.1   98.1   98.1   98.1   98.1
400     78.8   94.4   98.8   98.1   98.8   98.1   98.8   98.8   98.8
800     71.3   95.6   98.8   98.8   98.1   98.1   98.1   98.1   98.1
1600    64.4   75.0   98.8   97.5   98.1   98.1   98.1   98.1   98.1


Table 3. Classification rates (%) with respect to $s$.

s               50     75    100    125    150    175    200    225    250
Classi. rate  98.1   98.1   98.1   98.1   98.1   98.1   98.1   98.8   98.1


The overall procedure of our system can be summarized as follows (Figure 4): 

(1) Generate $v$ composite vectors $x_i(k) \in \mathbb{R}^l$, $i = 1, \ldots, v$, from an e-nose data sample by shifting
a window of length $l$ by the step size $s$.

(2) For each composite vector $x_i(k)$, compute the within-class distance ($D_W^i$) and the between-class distance ($D_B^i$).

(3) Compute the discriminant distance for the $i$-th composite vector by $S_i = D_B^i - \beta D_W^i$.

(4) Construct the measure vector $S \in \mathbb{R}^v$ whose $i$-th element is $S_i$.

(5) Select the $n_{cf}$ composite vectors with the largest $S_i$.

(6) Extract the final composite features using only the
selected composite vectors.



Figure 4. Overall procedure of the proposed electronic nose system. 




4. Experimental Results 

The VOC measurement data consists of 8 classes: acetone, benzene, cyclo-hexane, ethanol,
heptane, methanol, propanol, and toluene [15]. For each class, we obtained 20 samples, and thus
the total data set contains 160 samples. Figure 5 shows the distribution of the data samples in the
subspace spanned by two principal component axes. The e-nose sensor used in this experiment
samples vapors at 10 Hz, which corresponds to 2,000 points per
200 s. Each data sample was measured through 16 channels over 2,000 time points and was represented
as a 16 × 2,000 matrix. Then, the raw data was transformed into a 32,000-dimensional vector by using
the lexicographic ordering operator for feature extraction (Figure 2).
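
The arithmetic behind these sizes can be checked with a short sketch. It assumes that "lexicographic ordering" means row-major flattening (all time points of channel 1, then channel 2, and so on); the source does not spell out the ordering, and the placeholder matrix below is not real sensor data.

```python
import numpy as np

RATE_HZ = 10                                   # one reading every 0.1 s
PHASES_S = {"stabilization": 30, "exposure": 60, "purge": 110}
points = RATE_HZ * sum(PHASES_S.values())      # 10 * 200 = 2000 time points
channels = 16

# Placeholder measurement matrix (16 x 2000); real sensor readings would go here.
raw = np.arange(channels * points, dtype=float).reshape(channels, points)

# ASSUMPTION: lexicographic ordering is read as row-major flattening, so the
# 2,000 time points of one channel stay contiguous in the long vector.
x = raw.reshape(-1)
print(points, x.shape)  # 2000 (32000,)
```

Under this assumption, element 2,000 of the long vector is the first time point of the second channel.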

When setting $l$ and $s$ to 400 and 200, respectively, a total of 159 composite vectors can be generated
from a 32,000-dimensional data sample. We measured the discriminant score of each composite vector
by using the proposed method. Out of the total 159 composite vectors, we represented the composite
vectors with the top 60 and 120 scores as '1' and the rest as '0' (Figure 6). In Figure 6, we can see that the
'stabilization' and 'purge' periods contain discriminative information for odor classification as well
as the 'exposure' period.

Figure 5. Distribution of the data samples in the Principal Component Analysis (PCA)
feature space.

Figure 6. Distribution of the selected composite vectors. (a) 60 composite vectors;
(b) 120 composite vectors.

We compared the classification performance of the proposed method (CVS) with that of the LDA
method [26], the FF (Feature Feedback) method [38], the CC-PCA (Component Correction by PCA)
method [39], and the CC-CPCA (Component Correction by Common PCA) method [22]. We applied
PCA after CC-PCA and CC-CPCA, which slightly increased their classification rates. Each method
was evaluated using an 8-fold cross validation strategy [40]. In this scheme, the data is first randomly
partitioned into 8 equally sized folds. Then, 8 iterations of training and testing are performed, within
each of which a different fold of the data (20 data samples) is used for testing, while the remaining
7 folds (140 data samples) are used for training. The nearest neighbor rule was used as a classifier and
the $l_2$ norm was used to measure the distance between two samples. We repeated this test 8 times and
computed the average classification rate. All the data samples are normalized using the mean and the
variance of the training set.
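
The evaluation protocol can be sketched as below: random 8-fold partitioning, standardization with training-set statistics only, and a nearest neighbor classifier with the $l_2$ distance. The function names, the fixed seed, and the synthetic feature vectors are illustrative assumptions; they stand in for the extracted composite features, which are not reproduced here.

```python
import numpy as np

def nn_predict(train_X, train_y, test_X):
    """1-nearest-neighbor prediction with the l2 (Euclidean) distance."""
    d = np.linalg.norm(test_X[:, None, :] - train_X[None, :, :], axis=2)
    return train_y[np.argmin(d, axis=1)]

def cross_validate(X, y, n_folds=8, seed=0):
    """Mean 1-NN accuracy over a random partition into n_folds folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    accs = []
    for i in range(n_folds):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        mu = X[train].mean(axis=0)                 # statistics from training folds only
        sd = X[train].std(axis=0) + 1e-12          # guard against division by zero
        pred = nn_predict((X[train] - mu) / sd, y[train], (X[test] - mu) / sd)
        accs.append(np.mean(pred == y[test]))
    return float(np.mean(accs))

# Synthetic stand-in: 8 well-separated classes with 20 samples each (160 total).
rng = np.random.default_rng(0)
y = np.repeat(np.arange(8), 20)
X = rng.normal(scale=0.1, size=(160, 5)) + 3.0 * y[:, None]
acc = cross_validate(X, y)
print(acc)
```

Computing the mean and standard deviation from the training folds alone keeps information from the test fold out of the normalization.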

Since noise is likely to occur in sensing data, we added Gaussian noise with a standard deviation
of 3 to each data sample, and evaluated the robustness of each method to the noise (Figure 7).
Figures 7 and 8 show examples of the data with and without Gaussian noise and the classification rates for each
case, respectively.
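
A noise-injection step of this kind might look like the following sketch. It assumes zero-mean, independent noise on every element, which the text does not state explicitly; the function name `add_gaussian_noise` and the all-zero placeholder samples are hypothetical.

```python
import numpy as np

def add_gaussian_noise(X, std=3.0, seed=0):
    """Return a copy of X with i.i.d. zero-mean Gaussian noise added.

    ASSUMPTION: the noise is zero-mean and independent per element.
    """
    rng = np.random.default_rng(seed)
    return X + rng.normal(loc=0.0, scale=std, size=X.shape)

clean = np.zeros((4, 32000))                 # placeholder samples, not real data
noisy = add_gaussian_noise(clean)
print(round(float(noisy.std()), 1))  # 3.0
```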

Figure 7. Electronic nose data without and with Gaussian noise. (a) Electronic nose data without
noise; (b) Data with Gaussian noise (std 3).

Figure 8. Classification rates for the electronic nose data. (a) Classification rates for
the original electronic nose data; (b) Classification rates for the electronic nose data with
Gaussian noise (std 3).

For the original data, all the methods classified each vapor well with high classification rates, as can
be seen in Figure 8a. When Gaussian noise was added, the classification rates of the other methods
decreased rapidly (Figure 8b). In contrast, the proposed method gave consistently high classification
rates of 97.3% to 98.4%, which shows that our system performs reliably in a noisy environment.

5. Conclusions

We have presented a method to select useful composite vectors for odor classification. Composite
vectors, which are generated from an electronic nose data sample by shifting the window, are likely to
contain information that is redundant for extracting discriminant features, as well as noise introduced
during measurement with a sensor array. Thus, we evaluated the class separability power of each composite vector based on
a discriminant distance and selected only the composite vectors with large discriminative information.
This selection process has the advantage of viewing the electronic nose response holistically by focusing
on the extraction of informative response characteristics. The proposed composite vector selection
method not only reduced the computational complexity, but also helped to extract better features.
Extracting good features not only relieves the influence of noise in the measured data, but also improves
the performance of a classifier such as SVM or NN. When using SVM without any feature extraction,
the classification rate for the original electronic nose data was 98.0%, but it
dropped to 51.2% for the data with Gaussian noise. In contrast, NN with the features extracted
by the proposed method gave classification rates of 99.8% and 98.4% for the same data sets,
respectively. Hence, the proposed method can be utilized together with algorithms for other classification
processes such as feature selection or classifier design to improve the performance of the overall
classification system.

In this paper, we focus on the classification between gas data classes without interference. It is also
important to classify e-nose data that contain combinations of gases, different concentrations, etc.
In the near future, we will deal with the interference between gases and gas combinations.